(57) Abstract: Method and arrangement to detect a picture repetition mode of film material with a series of consecutive fields, the 
arrangement having processing means and a memory (M), the processing means being arranged to carry out the following steps: 
identifying a plurality of different objects within the consecutive fields using a segmentation method, an object being defined as 
an image portion of the consecutive fields that can be described with a single motion model; and carrying out for each one of the 
plurality of objects, the steps of establishing a motion parameter pattern for each one of the objects within the consecutive fields; 
comparing the motion parameter pattern with a number of predetermined motion parameter patterns; and determining the picture 
repetition mode for each one of the objects using the result of the preceding step. 
Recognizing film and video objects occurring in parallel in single television signal fields 



FIELD OF THE INVENTION 

The present invention relates to the field of detecting motion picture film 
sources in film material. 


PRIOR ART 

In US-A-5,734,735 a method and system are described that analyse a series of 
video images. The types of production media used to produce these video images are 
detected. Each of the series of video images is segmented into a series of cells in order to 
retain spatial information. The spatial information is used to detect the type of production 
media. No technique is disclosed to detect types of production for different scenes within one 
image, coming from different sources and being mixed to form the single image. 

US-A-6,014,182 also relates to methods for detecting motion picture film 
sources. Such a detection might be useful in several environments, such as line doublers, 
television standard converters, television slow motion processing and video compression. For 
instance, a 60 Hz NTSC television signal has a 24 frame/second motion picture film as its 
source. In such a scheme, a 3-2 pull-down ratio is used, i.e., three video fields come from one 
film frame whereas the next two video fields come from the next film frame, etc. E.g., labelling 
each video field with the film frame it comes from (A, B, C, D, E), a 3-2 pull-down sequence 
would look like AAABBCCCDDEEE. Other sources have a 2-2 pull-down ratio or relate to video 
camera material, as is known to persons skilled in the art. Thus, comparing successive fields 
yields information about the motion picture source used. 
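
By way of illustration only, the pull-down schemes mentioned above can be sketched in a few lines of Python. The function names `pulldown_fields` and `repeated_field_flags` are hypothetical and do not come from the cited documents. Fields are represented merely by the label of the film frame they come from, so interlacing and noise are ignored.

```python
from itertools import cycle


def pulldown_fields(frames, pattern=(3, 2)):
    """Repeat each film frame for the number of fields given by the cyclic pattern."""
    fields = []
    for frame, repeat in zip(frames, cycle(pattern)):
        fields.extend([frame] * repeat)
    return fields


def repeated_field_flags(fields):
    """For each field after the first, flag whether it equals its predecessor."""
    return [cur == prev for prev, cur in zip(fields, fields[1:])]


print("".join(pulldown_fields("ABCDE")))          # AAABBCCCDDEEE
print("".join(pulldown_fields("ABCDE", (2, 2))))  # AABBCCDDEE
```

The flags returned by `repeated_field_flags` show the periodic structure that a film mode detector looks for when it compares successive fields.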

US-A-5,365,280 proposes to use different motion vectors for different fields 
and to generate a picture signal processing mode control signal that can be used by a 
television receiver as an indication that the fields relate either to movie-film or 
non-movie-film. 

Motion estimation algorithms can be found in M. Tekalp, "Digital Video 
Processing", Prentice Hall, ISBN 0-13-190075-7. An overview of object based motion 
estimation methods is given by Paolo Vicari, "Representation and regularization of motion 
fields with region-based models", thesis for the Politecnico di Milano, no. 598034. 

SUMMARY OF THE INVENTION 

So far, the prior art has concentrated on detecting motion picture sources of either 
films having fields originating from a single motion picture source or films having 
subsequent fields originating from two or more different motion picture sources. However, an 
increasing number of films comprise mixtures of images within fields that originate from 
different motion picture sources. None of the prior art methods discussed above is able to 
detect the picture repetition mode of individual images within fields of a film. In picture 
rate conversion applications, for instance, the origin of the individual images within the 
fields needs to be known. More particularly, it is necessary to know whether the video 
originates from film material in order to optimally perform de-interlacing and film 
judder removal. 

Therefore, it is an objective of the present invention to provide an apparatus 
and a method that allow the picture repetition mode of individual objects within fields 
to be detected. In this context, an "object" may be a portion of an individual image in a field. An 
"object" is defined as an image portion that can be described with a single motion model. 
Such an "object" need not necessarily comprise one "physical" object, like a picture of one 
person. An object may well relate to more than one physical object, e.g., a person sitting on a 
bike where the movement of the person and the bike, essentially, can be described with the 
same motion model. On the other hand, one can safely assume that objects identified in this 
way belong to one single image originating from one single film source. 

To achieve this objective, the present invention provides a method to detect 
a picture repetition mode of film material comprising a series of consecutive fields, the 
method comprising the following steps: 

> Establishing a motion parameter pattern for the film material; 

> Comparing the pattern with a number of predetermined motion parameter patterns; 

> Determining the picture repetition mode using the result of the preceding step; 

characterized in that the method includes the following steps: 

• Identifying a plurality of different objects within the consecutive fields, an object being 
defined as an image portion of the consecutive fields that can be described with a single 
motion model; 

• Carrying out the following steps: 

> Establishing a motion parameter pattern for each one of the objects within the 
consecutive fields; 

> Comparing the motion parameter pattern with a number of predetermined motion 
parameter patterns; 

> Determining the picture repetition mode for each one of the objects using the result of 
the preceding step. 

Thus, in accordance with the present invention, prior to detecting a film mode, 
the fields of the television signal are separated into different objects by means of a 
segmentation technique. Any known technique to do so might be used for that purpose. Then, 
the film mode of each individual object is detected. Any known film mode detection 
technique might be used for that purpose. 
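
Purely as a sketch of the per-object detection step, and not as the detection technique of the invention, the following Python fragment compares a motion pattern per object with a few predetermined patterns. Everything specific in it is an assumption made for illustration: the names `PATTERNS`, `matches`, `detect_mode` and `detect_modes_per_object`, the reduction of the motion parameters to one magnitude per field, the threshold, and the binary patterns themselves (1 = motion with respect to the previous field, 0 = no motion).

```python
# Hypothetical predetermined patterns: one period of motion (1) / no motion (0).
PATTERNS = {
    "3-2 pull-down": (1, 0, 0, 1, 0),
    "2-2 pull-down": (1, 0),
    "video": (1,),
}


def matches(observed, pattern):
    """True if the observed sequence equals the cyclic pattern at some phase."""
    period = len(pattern)
    return any(
        all(observed[i] == pattern[(i + phase) % period]
            for i in range(len(observed)))
        for phase in range(period)
    )


def detect_mode(motion_magnitudes, threshold=0.5):
    """Binarize the per-field motion of one object and look up its pattern."""
    observed = [1 if m > threshold else 0 for m in motion_magnitudes]
    for name, pattern in PATTERNS.items():
        if matches(observed, pattern):
            return name
    return "unknown"


def detect_modes_per_object(motion_by_object):
    """Apply the detection separately to every object found by segmentation."""
    return {obj: detect_mode(m) for obj, m in motion_by_object.items()}


example = {
    "background": [2.0, 0.0, 0.1, 1.8, 0.0, 2.1, 0.0, 0.0, 1.9, 0.1],
    "overlay": [1.0, 1.2, 0.9, 1.1, 1.0, 1.3, 1.0, 0.8, 1.1, 1.0],
}
print(detect_modes_per_object(example))
```

In this example the two objects of one and the same field sequence receive different picture repetition modes, which is the situation the per-object approach is meant to handle. Short observation windows can be ambiguous, so a practical detector would need a sufficiently long history.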

Preferably, a motion parameter estimation technique is used as well. 

As far as the inventors are aware, nobody has so far tried to use the technique of 
motion parameter estimation to identify different image portions (objects) originating from 
different sources because of mixing. 

The invention also relates to an arrangement to detect a picture repetition 
mode of film material comprising a series of consecutive fields, the arrangement comprising 
processing means and a memory, the processing means being arranged to carry out the 
following steps: 

> Establishing a motion parameter pattern for the film material; 

> Comparing the pattern with a number of predetermined motion parameter patterns stored 
in the memory; 

> Determining the picture repetition mode using the result of the preceding step; 

characterized in that the processing means are arranged to carry out the following steps: 

• Identifying a plurality of different objects within the consecutive fields, an object being 
defined as an image portion of the consecutive fields that can be described with a single 
motion model; 

• Carrying out the following steps: 

> Establishing a motion parameter pattern for each one of the objects within the 
consecutive fields; 

> Comparing the motion parameter pattern with a number of predetermined motion 
parameter patterns stored in the memory; 

> Determining the picture repetition mode for each one of the objects using the result of 
the preceding step. 


Such an arrangement may, advantageously, be implemented on a chip. A 
television comprising such a chip, as well as the chip itself, are also claimed in this invention. 

The invention also relates to a computer program product to be loaded by a 
computer arrangement, comprising instructions to detect a picture repetition mode of film 
material comprising a series of consecutive fields, the arrangement comprising processing 
means and a memory, the computer program product, after being loaded, providing the 
processing means with the capability to carry out the following steps: 

> Establishing a motion parameter pattern for the film material; 

> Comparing the pattern with a number of predetermined motion parameter patterns stored 
in the memory; 

> Determining the picture repetition mode using the result of the preceding step; 

characterized in that the processing means are arranged to carry out the following steps: 

• Identifying a plurality of different objects within the consecutive fields, an object being 
defined as an image portion of the consecutive fields that can be described with a single 
motion model; 

• Carrying out the following steps: 

> Establishing a motion parameter pattern for each one of the objects within the 
consecutive fields; 

> Comparing the motion parameter pattern with a number of predetermined motion 
parameter patterns stored in the memory; 

> Determining the picture repetition mode for each one of the objects using the result of 
the preceding step. 



BRIEF DESCRIPTION OF THE DRAWINGS 



The invention will now be explained with reference to some drawings that are 
only intended to illustrate the present invention and not to limit its scope. The scope is only 
limited by the annexed claims. 



WO 02/056597 - PCT/IB02/00019 


Figure 1 shows a block diagram of a multiple parameter estimator and 
segmentation arrangement. 

Figures 2A, 2B, 2C, 2D show television screen photographs illustrating a 
process of selecting points of interest on which parameter estimators optimise their 
parameters. 

Figures 3A, 3B, 3C, 3D show television screen photographs illustrating a 
process of segmentation. 

DESCRIPTION OF THE PREFERRED EMBODIMENT 


Introduction 

Hereinafter, a method to detect a film mode of individual objects in a scene is 
proposed. To that end, first of all, a method is described to identify individual objects in a 
scene. Individual objects are identified by motion estimation, i.e., those portions of a scene 
that can be described with a same motion model are identified as belonging to a same object 
in the scene. Motion estimators are known as such from the prior art, e.g., from [1], [3], [4], 
[5], and [6]. Of these references, [1] describes a motion estimator that allows objects in a 
scene to be identified without the need to apply an image segmentation. 

For the present invention, a motion estimator is preferred that is designed to be 
suitable for picture rate conversion, with a computational complexity suitable for consumer 
electronics application, i.e. comparable to [5, 6]. 

The most striking characteristic of the object motion estimator described 
earlier in [1] is that, unlike other prior art object motion estimators, it puts no effort into 
segmenting the image into objects prior to estimation of the model parameters. Basically, a 
relatively small number of interesting image parts is selected, and a number of parallel 
motion model parameter estimators try to optimize their parameters on this data set. As 
soon as one of the estimators is more successful than another in a certain number of 
interesting image parts, it is focused on those parts, whereas the remaining estimators focus 
on the other parts. In short: individual estimators try to conquer image parts from one 
another, dividing the total image into "objects". This prior art object motion estimator allows 
a real-time object-based motion estimation and can advantageously be used in the film 
detection technique of the present invention. 
Fundamentally, such an object-based motion estimator that wastes no effort in 
expensive segmentation of the image should be able to compete in operations count with a 
block based motion estimator, as one should expect fewer objects than blocks in realistic 
images. It is only in the assignment of image parts to objects that an effort is required 
comparable to the evaluation of candidate vectors on block basis. If the number of objects 
does not exceed the number of candidate vectors too much, the overhead of an object based 
motion estimator should be negligible. It is assumed here that the motion per object can be 
described with fairly simple parametric models. 

In the following subsections, we shall describe a preferred motion model used, 
an estimation of motion model parameters, a preferred cost function used, a segmentation 
process and a film mode detection for individual objects within a scene. 

Motion model 

To keep complexity low, the motion of each object o is described by a simple 
first order linear model that can only describe translation and scaling. More complex 
parametric motion models are known to persons skilled in the art, e.g., models including 
rotation, and can indeed be applied in combination with the proposed algorithm, but will be 
disregarded here, as we shall introduce a refinement that makes such complex models 
obsolete. 

The model used is: 



$$
\vec{D}_o(\vec{x}, n) =
\begin{pmatrix}
s_x(o,n) + x\, d_x(o,n) \\
s_y(o,n) + y\, d_y(o,n)
\end{pmatrix}
\qquad (1)
$$

using D_o(x, n) for the displacement vector of object o at location x = (x, y)^T in 
the image with index n. It is observed that x is associated with pixel locations. 
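
As a minimal sketch of the model of equation (1), the displacement at a pixel location can be evaluated as follows. The function name `displacement` and the ordering of the parameters in the tuple are assumptions made for this illustration only.

```python
def displacement(params, x, y):
    """Displacement vector of the model of equation (1) at pixel location (x, y).

    params = (s_x, s_y, d_x, d_y): a translation (s_x, s_y) plus a term that
    grows linearly with the position (d_x, d_y), which describes scaling.
    """
    s_x, s_y, d_x, d_y = params
    return (s_x + x * d_x, s_y + y * d_y)


# Pure translation: the same vector everywhere in the object.
print(displacement((2.0, -1.0, 0.0, 0.0), 100, 50))  # (2.0, -1.0)

# Pure scaling: the vector grows with the distance from the origin.
print(displacement((0.0, 0.0, 0.5, 0.25), 4, 8))     # (2.0, 2.0)
```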



Parameter estimation 



Given a motion model, its parameters next need to be optimized for a given 
object in the image. As stationary image parts occur in almost every sequence, we assume the 
presence of an 'object o, o = 0', for which the motion is described by 0, the zero vector. Clearly, 
no estimation effort is required to make this available. The parameter vectors of additional 
objects o, o > 0, are estimated separately, in parallel, by their respective parameter estimators 
(PE_m, m = 1, 2, ..., M), as shown in Figure 1. 

Figure 1 shows a block diagram of an arrangement with a plurality of 
parameter estimators PE_m(n) connected in parallel to the output of a data reduction unit DRU. 
The data reduction unit DRU is arranged to select a set of interesting image pixels that are to 
be used in the calculations made. Inputs to the data reduction unit DRU are the image at time 
n and said image at time n-1. Each of the outputs of the PE_m(n) is connected to a 
segmentation unit SU. 

The output of the segmentation unit SU is fed back to the parameter estimators 
PE_m(n) since, preferably, they together perform a recursive operation as will be explained 
below. The end result of the segmentation process is formed by groups of pixels of a scene, 
each group of pixels belonging to a different object and having assigned to it a different 
motion vector. These output data are supplied to a processing unit PU that is arranged to 
detect the type of film source per object and to perform predetermined tasks on the different 
objects such as picture rate conversion. The processing unit PU is connected to memory M 
storing predetermined motion parameter patterns used to detect the type of film source as will 
be explained below. The memory M may be of any known type, e.g., RAM, ROM, EEPROM, 
hard disc, etc. The output of the processing unit PU, for instance, controls a television screen. 

It is observed that the data reduction unit DRU, the parameter estimators 
PE_m(n), the segmentation unit SU and the processing unit PU are shown as separate blocks. 
These blocks may be implemented as separate intelligent units having distinct processors and 
memories. However, as is evident to persons skilled in the art, these units may be integrated 
into a single unit such as a general purpose microprocessor comprising a processor and 
suitable memory loaded with suitable software. Such a microprocessor is not shown but 
known from any computer handbook. Alternatively, the arrangement shown in figure 1 may 
be implemented as a hard wired logic unit, as known to persons skilled in the art. Preferably, 
the entire arrangement shown in figure 1 is encapsulated as a single chip in a single package. 
Such a single chip package can be easily included in a television apparatus. 
Each PE_m(n) updates a previously estimated parameter vector, after which the 
best parameter candidate vector, according to a cost function, is selected as the result 
parameter vector for that object. 

Considering the four parameter model of equation (1), the parameters of object 
o, o > 0, are regarded as a parameter vector P_o(n): 

$$
\vec{P}_o(n) =
\begin{pmatrix}
s_x(o,n) \\ s_y(o,n) \\ d_x(o,n) \\ d_y(o,n)
\end{pmatrix}
\qquad (2)
$$

and we define our task as to select P_o(n) from a number of candidate 
parameter vectors C_o(n) as the one that has the minimal value of a cost function, to which 
we shall return later on. 

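Preferably, the candidates are generated in much the same way as in the strategy exploited 
in [5, 6], i.e. take a prediction vector, add at least one update vector, and select the best 
candidate parameter vector according to an error criterion. Candidate parameter set CS_o(n) 
contains three candidates C_o(n) according to: 

$$
CS_o(n) = \left\{ \vec{C}_o(n) \;\middle|\; \vec{C}_o(n) = \vec{P}_o(n-1) + m\,\vec{U}_o(n),\;
\vec{U}_o(n) \in US_o(n),\; m = -1, 0, 1 \right\}
\qquad (3)
$$

with update parameter U_o(n) selected from update parameter set US_o(n): 

$$
US_o(n) = \left\{
\begin{pmatrix} i \\ 0 \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix} 0 \\ i \\ 0 \\ 0 \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ i \\ 0 \end{pmatrix},
\begin{pmatrix} 0 \\ 0 \\ 0 \\ i \end{pmatrix}
\right\},
\quad i = 1, 2, 4, 8, 16
\qquad (4)
$$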
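
The candidate generation of equation (3) can be sketched as follows, assuming that each update vector changes exactly one of the four parameters by a step i. The function names `update_set`, `candidate_set` and `best_candidate` are hypothetical, and the rule by which one update vector is selected for a given image is not modelled here. The cost function is passed in by the caller.

```python
def update_set(i):
    """Four update vectors, each changing one of the four parameters by step i."""
    return [tuple(i if k == j else 0 for k in range(4)) for j in range(4)]


def candidate_set(prev, update):
    """Three candidates prev + m * update, for m = -1, 0, 1 (equation (3))."""
    return [tuple(p + m * u for p, u in zip(prev, update)) for m in (-1, 0, 1)]


def best_candidate(prev, update, cost):
    """Select the candidate parameter vector with the minimal cost."""
    return min(candidate_set(prev, update), key=cost)


prev = (1, 1, 0, 0)                  # previously estimated (s_x, s_y, d_x, d_y)
update = update_set(2)[0]            # (2, 0, 0, 0): step of 2 on s_x
print(candidate_set(prev, update))   # [(-1, 1, 0, 0), (1, 1, 0, 0), (3, 1, 0, 0)]
```

Because the candidate with m = 0 is the previous estimate itself, the selected vector never has a higher cost than the prediction under the same cost function.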



Given the motion model and some candidate parameter sets, we need to select 
the best candidate, according to a cost function, as the result for a given object. The cost 
function can be a sum of absolute differences between motion compensated pixels from 
neighboring images, with vectors generated with the (candidate) motion model. However, we 
need to know the area to which the motion model is to be assigned. The two issues, 
segmentation and motion estimation, are inter-dependent. In order to correctly estimate the 
motion in one object, the object should be known and vice versa. 

As a first step in the motion estimation process, we define a set with pixel 
blocks of interest. These form the set SI(n) of "interesting" image parts that will be used as a 
basis for optimization of all parametric models. 

Now, the focus of the individual parameter estimators has to be on different 
objects. To this end, each parameter estimator PE_m(n) will calculate its cost function on the 
same set of interesting locations defined in set SI, giving different locations a different weight 
factor, W_o(X). Here, X is associated with a position of a block of pixels. The proposed 
algorithm is straightforward: 

♦ The pixel values are multiplied with a first weight factor larger than 1, e.g. 8, in 
case the pixel in SI(n) belonged to the same object, i.e. the same parameter 
estimator, according to the previous image segmentation step. 

♦ The pixel values are multiplied with a second weight factor smaller than 1, e.g. 
0.1, in case the segmentation assigned the position to another parameter estimator 
and this estimator achieved low match errors. 
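
The two weighting rules above can be sketched as follows. The function name `weight`, the dictionary and set used to represent the previous segmentation, and the neutral weight of 1 for all remaining positions are assumptions made for this illustration. The text only specifies the two cases with the example values 8 and 0.1.

```python
def weight(position, estimator, prev_segmentation, low_error_positions,
           w_same=8.0, w_other=0.1, w_default=1.0):
    """Weight factor of one block position for one parameter estimator.

    prev_segmentation maps a block position to the estimator it was assigned
    to in the previous segmentation step; low_error_positions holds the
    positions where that owning estimator achieved a low match error.
    """
    owner = prev_segmentation.get(position)
    if owner == estimator:
        return w_same        # emphasize positions already won by this estimator
    if owner is not None and position in low_error_positions:
        return w_other       # suppress positions well described by another one
    return w_default         # assumed neutral weight for all other positions


segmentation = {(0, 0): 1, (8, 0): 2, (16, 0): 2}
low_error = {(8, 0)}
print([weight(p, 1, segmentation, low_error)
       for p in [(0, 0), (8, 0), (16, 0), (24, 0)]])  # [8.0, 0.1, 1.0, 1.0]
```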

Figure 2 gives an example of a selection of pixel blocks of interest in an image 
with a single moving object, i.e., a bicyclist, and a moving background. This selection is 
carried out by the data reduction unit DRU. Thus, the data reduction unit renders a set of 
most interesting pixel elements (SI), resulting in a rather cheap (few calculations) and 
effective parameter estimation. Figure 2 shows screen photographs illustrating a process of 
selecting points of interest on which the parameter estimators optimize their parameters. The 
temporal difference image, between two successive pictures, is not actually calculated, but it 
serves to understand why the high match errors of the vector 0, i.e. the total set with points 
of interest, are at the positions shown in figure 3C. In figure 3D it is shown how, in this 
example, the focus of two parameter estimators is divided over the points of interest. I.e., 
figure 3D shows that there are two different motion models detected. The two sub-sets are 
shown in a different brightness, i.e., one in black and the other one in grey. 

The moving background of the image is object o = 1, and the bicyclist is 
object o = 2. There are two parameter estimators that are both optimized on the same set 
containing the blocks of interest, but as soon as one estimator is selected in the segmentation 
to be best in an area, the pixel block of interest in that area is emphasized in the cost function. 
After a while, this converges to the situation illustrated, where one estimator focuses on the 
grey blocks and the other on the white pixel blocks in SI(n) . 

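More formally, the cost function is calculated according to: 

$$
\epsilon(\vec{C}_o, n) = \sum_{\vec{x} \in SI(n)} W_o(\vec{x})\,
\left| F_s(\vec{x}, n) - F_s\!\left(\vec{x} - \vec{C}_o(\vec{x}, n),\; n-1\right) \right|
$$

where F_s(x, n) is the luminance value of a pixel at position x in a sub-sampled 
image with index n, and C_o(x, n) is the vector resulting from candidate model 
C_o(n) at position x. 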
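
A weighted sum of absolute differences of the kind described above can be sketched as follows. This is an illustration under stated assumptions rather than the cost function of the embodiment: the function name `weighted_sad` is hypothetical, images are plain lists of rows of luminance values, the motion-compensated pixel is fetched from the previous image at position x minus the vector, positions are rounded to the nearest pixel and clamped to the image instead of being interpolated, and positions without an explicit weight get weight 1.

```python
def weighted_sad(cur, prev, positions, weights, vector_at):
    """Weighted sum of absolute differences over the positions of interest.

    cur, prev : images as lists of rows of luminance values
    positions : iterable of (x, y) positions of interest
    weights   : dict mapping (x, y) to a weight factor (default 1.0)
    vector_at : function (x, y) -> (dx, dy) given by the candidate model
    """
    height, width = len(prev), len(prev[0])
    total = 0.0
    for (x, y) in positions:
        dx, dy = vector_at(x, y)
        # Motion-compensated position in the previous image (assumed sign),
        # rounded to the nearest pixel and clamped to the image borders.
        px = min(max(int(round(x - dx)), 0), width - 1)
        py = min(max(int(round(y - dy)), 0), height - 1)
        total += weights.get((x, y), 1.0) * abs(cur[y][x] - prev[py][px])
    return total


prev = [[0, 10, 20, 30], [0, 10, 20, 30]]
cur = [[0, 0, 10, 20], [0, 0, 10, 20]]      # prev shifted one pixel to the right
points = [(1, 0), (2, 0), (3, 1)]
print(weighted_sad(cur, prev, points, {}, lambda x, y: (1, 0)))  # 0.0
print(weighted_sad(cur, prev, points, {}, lambda x, y: (0, 0)))  # 30.0
```

The candidate whose vectors match the true shift yields the lowest cost, which is what the selection among candidate parameter vectors relies on.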

The sub-sampling effectively reduces the required memory bandwidth. Images 
are sub-sampled with a factor of four horizontally and a factor of two vertically on a field 
basis, generating a sub-sampled image F_s(n) from each original field F(n). In order to achieve 
pixel accuracy on the original pixel grid of F, interpolation is required on the sub-sampling 
grid. 
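
The sub-sampling factors can be illustrated with the following sketch. It uses plain decimation, i.e. it keeps every fourth pixel of every second line. Whether the embodiment filters the field before sub-sampling is not stated here, so the absence of a pre-filter is an assumption, as are the function names `subsample` and `to_original_grid`.

```python
def subsample(field, h_factor=4, v_factor=2):
    """Keep every v_factor-th line and every h_factor-th pixel of a field."""
    return [row[::h_factor] for row in field[::v_factor]]


def to_original_grid(x_sub, y_sub, h_factor=4, v_factor=2):
    """Position on the original field grid of a sub-sampled pixel."""
    return (x_sub * h_factor, y_sub * v_factor)


field = [[r * 10 + c for c in range(8)] for r in range(4)]  # 4 lines of 8 pixels
print(subsample(field))        # [[0, 4], [20, 24]]
print(to_original_grid(1, 1))  # (4, 2)
```

One step on the sub-sampled grid thus corresponds to four pixels horizontally and two lines vertically on the original grid, which is why interpolation between sub-sampled pixels is needed for pixel accuracy.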

Recursive segmentation 

The segmentation is the most critical step in the algorithm. Its task is to assign 
one motion model to each group of pixels. For each block, a block match error ε_o(X, n) 
corresponding to each of the estimated parameter vectors P_o can be calculated according to: 



The temporal instance where this segmentation is valid is defined by α. 



